An Attention-driven Hierarchical Multi-scale Representation for Visual Recognition
Convolutional Neural Networks (CNNs) have revolutionized the understanding of
visual content. This is mainly due to their ability to break down an image into
smaller pieces, extract multi-scale localized features and compose them to
construct highly expressive representations for decision making. However, the
convolution operation is unable to capture long-range dependencies such as
arbitrary relations between pixels since it operates on a fixed-size window.
Therefore, it may not be suitable for discriminating subtle changes (e.g.
fine-grained visual recognition). To this end, our proposed method captures the
high-level long-range dependencies by exploring Graph Convolutional Networks
(GCNs), which aggregate information by establishing relationships among
multi-scale hierarchical regions. These regions range from smaller (closer
look) to larger (far look), and the dependency between regions is modeled by an
innovative attention-driven message propagation, guided by the graph structure
to emphasize the neighborhoods of a given region. Our approach is simple yet
extremely effective in solving both the fine-grained and generic visual
classification problems. It outperforms the state of the art by a
significant margin on three datasets and is very competitive on the other two.
Comment: Accepted in the 32nd British Machine Vision Conference (BMVC) 2021
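Below is a minimal PyTorch sketch of attention-driven message passing over a graph of multi-scale image regions, in the spirit of the abstract. The hierarchy construction, layer choices and dimensions are illustrative assumptions, not the paper's exact architecture.

```python
# A minimal sketch, assuming a toy 3-level region hierarchy; not the
# authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionMessagePassing(nn.Module):
    """One round of attention-weighted aggregation over region neighbours."""
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.update = nn.Linear(2 * dim, dim)

    def forward(self, x, adj):
        # x:   (N, D) features, one row per hierarchical region
        # adj: (N, N) adjacency; nonzero where two regions are related
        scores = self.query(x) @ self.key(x).t() / x.size(-1) ** 0.5
        scores = scores.masked_fill(adj == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)  # emphasise a region's neighbours
        messages = attn @ x                   # aggregate neighbour information
        return F.relu(self.update(torch.cat([x, messages], dim=-1)))

# Toy usage: 7 regions (1 whole image, 2 mid-scale, 4 small), 256-d features.
x = torch.randn(7, 256)
adj = torch.eye(7)
adj[0, 1:3] = adj[1:3, 0] = 1   # whole image <-> mid-scale regions
adj[1, 3:5] = adj[3:5, 1] = 1   # mid-scale <-> small regions
adj[2, 5:7] = adj[5:7, 2] = 1
refined = AttentionMessagePassing(256)(x, adj)
```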
SR-GNN: Spatial Relation-aware Graph Neural Network for Fine-Grained Image Categorization
Over the past few years, significant progress has been made in deep
convolutional neural network (CNN)-based image recognition. This is mainly
due to the strong ability of such networks in mining discriminative object pose
and parts information from texture and shape. However, this is often
insufficient for fine-grained visual classification (FGVC), which exhibits high
intra-class and low inter-class variance due to occlusions, deformations,
illumination, etc. Thus, an expressive feature representation describing global
structural information is key to characterizing an object/scene. To this end, we propose
a method that effectively captures subtle changes by aggregating context-aware
features from the most relevant image regions, together with their importance
in discriminating fine-grained categories, while avoiding bounding-box and/or
distinguishable part annotations. Our approach is inspired by recent advances
in self-attention and graph neural network (GNN) approaches, combining a simple
yet effective relation-aware feature transformation with its refinement by a
context-aware attention mechanism to boost the discriminability of the
transformed features in an end-to-end learning process.
Our model is evaluated on eight benchmark datasets consisting of fine-grained
objects and human-object interactions. It outperforms the state-of-the-art
approaches by a significant margin in recognition accuracy.
Comment: Accepted manuscript - IEEE Transactions on Image Processing
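A minimal sketch of the two ingredients the abstract names, a relation-aware feature transformation and its context-aware attention refinement, is shown below. All shapes and layers are assumptions for illustration and do not reproduce the paper's SR-GNN design.

```python
# A minimal sketch, assuming mean-pooled global context as the "relation";
# not the paper's exact layers.
import torch
import torch.nn as nn

class RelationAwareTransform(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.rel = nn.Linear(dim, dim)   # relation-aware feature transformation
        self.gate = nn.Linear(dim, 1)    # context-aware attention score

    def forward(self, regions):
        # regions: (B, N, D) pooled features of N candidate image regions
        context = regions.mean(dim=1, keepdim=True)          # global context
        transformed = torch.relu(self.rel(regions + context))
        attn = torch.softmax(self.gate(transformed), dim=1)  # (B, N, 1)
        return (attn * transformed).sum(dim=1)               # (B, D) descriptor

pooled = torch.randn(2, 16, 512)           # e.g. 16 regions per image, assumed
descriptor = RelationAwareTransform(512)(pooled)
logits = nn.Linear(512, 200)(descriptor)   # 200 fine-grained classes, assumed
```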
Attend and Guide (AG-Net): A Keypoints-driven Attention-based Deep Network for Image Recognition
This paper presents a novel keypoints-based attention mechanism for visual
recognition in still images. Deep Convolutional Neural Networks (CNNs) for
recognizing images with distinctive classes have shown great success, but their
performance in discriminating fine-grained changes is not at the same level. We
address this by proposing an end-to-end CNN model, which learns meaningful
features linking fine-grained changes using our novel attention mechanism. It
captures the spatial structures in images by identifying semantic regions (SRs)
and their spatial distributions, which proves to be key to modelling
subtle changes in images. We automatically identify these SRs by grouping the
detected keypoints in a given image. The "usefulness" of these SRs for image
recognition is measured using our innovative attention mechanism, focusing on
parts of the image that are most relevant to a given task. This framework
applies to traditional and fine-grained image recognition tasks and does not
require manually annotated regions (e.g. bounding-box of body parts, objects,
etc.) for learning and prediction. Moreover, the proposed keypoints-driven
attention mechanism can be easily integrated into the existing CNN models. The
framework is evaluated on six diverse benchmark datasets. The model outperforms
the state-of-the-art approaches by a considerable margin on the Distracted
Driver V1 (Acc: +3.39%), Distracted Driver V2 (Acc: +6.58%), Stanford-40 Actions
(mAP: +2.15%), People Playing Musical Instruments (mAP: +16.05%), Food-101 (Acc:
+6.30%) and Caltech-256 (Acc: +2.59%) datasets.
Comment: Published in IEEE Transactions on Image Processing 2021, Vol. 30, pp.
3691-3704
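Below is a minimal sketch of the keypoint-driven idea: detected keypoints are grouped into semantic regions (SRs), features are pooled per SR, and an attention score weighs each SR's usefulness. The keypoint source and the clustering choice (k-means here) are assumptions for illustration only.

```python
# A minimal sketch, assuming k-means grouping and random stand-in data;
# not the paper's AG-Net pipeline.
import numpy as np
from sklearn.cluster import KMeans

def semantic_regions(keypoints, feature_map, num_regions=4):
    # keypoints:   (K, 2) array of (row, col) locations in feature-map space
    # feature_map: (H, W, D) CNN feature map for the image
    labels = KMeans(n_clusters=num_regions, n_init=10).fit_predict(keypoints)
    regions = []
    for r in range(num_regions):
        pts = keypoints[labels == r]
        feats = feature_map[pts[:, 0], pts[:, 1]]   # (K_r, D) gather features
        regions.append(feats.mean(axis=0))          # average-pool the SR
    return np.stack(regions)                        # (num_regions, D)

def attend(regions, w):
    # w: (D,) task vector (random stand-in here); softmax over the SRs
    scores = regions @ w
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    return alpha @ regions                          # attended image descriptor

rng = np.random.default_rng(0)
kps = rng.integers(0, 14, size=(32, 2))             # 32 fake keypoints
fmap = rng.standard_normal((14, 14, 256))           # fake backbone features
desc = attend(semantic_regions(kps, fmap), rng.standard_normal(256))
```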
Context-aware Attentional Pooling (CAP) for Fine-grained Visual Classification
Deep convolutional neural networks (CNNs) have shown a strong ability in
mining discriminative object pose and parts information for image recognition.
For fine-grained recognition, context-aware rich feature representation of
object/scene plays a key role, since images exhibit significant variance
within the same subcategory and only subtle variance across different
subcategories. Finding the
subtle variance that fully characterizes the object/scene is not
straightforward. To address this, we propose a novel context-aware attentional
pooling (CAP) that effectively captures subtle changes via sub-pixel gradients,
and learns to attend informative integral regions and their importance in
discriminating different subcategories without requiring the bounding-box
and/or distinguishable part annotations. We also introduce a novel feature
encoding by considering the intrinsic consistency between the informativeness
of the integral regions and their spatial structures to capture the semantic
correlation among them. Our approach is simple yet extremely effective and can
be easily applied on top of a standard classification backbone network. We
evaluate our approach using six state-of-the-art (SotA) backbone networks and
eight benchmark datasets. Our method significantly outperforms the SotA
approaches on six datasets and is very competitive on the remaining two.
Comment: Extended version of the accepted paper at the 35th AAAI Conference on
Artificial Intelligence 2021
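A minimal sketch of attentional pooling over candidate integral regions, in the spirit of the abstract, is given below. Region proposal, the use of multi-head attention, and all dimensions are illustrative assumptions rather than the paper's CAP module.

```python
# A minimal sketch, assuming pre-extracted region features; not the
# paper's CAP implementation.
import torch
import torch.nn as nn

class ContextAttentionPool(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, regions):
        # regions: (B, R, D) features of R integral regions from the backbone
        ctx, _ = self.attn(regions, regions, regions)   # region-to-region context
        weights = torch.softmax(self.score(ctx), dim=1) # importance per region
        return (weights * ctx).sum(dim=1)               # (B, D) pooled feature

regions = torch.randn(2, 9, 512)   # e.g. 9 regions per image, assumed
pooled = ContextAttentionPool(512)(regions)
logits = nn.Linear(512, 100)(pooled)
```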
Deep Ear Biometrics for Gender Classification
Human gender classification based on biometric features is of major interest
in computer vision due to its wide variety of applications. The human ear is
popular among researchers as a soft biometric trait, because it is less
affected by age or changing circumstances, and is non-intrusive. In this study,
we have developed a deep convolutional neural network (CNN) model for automatic
gender classification using samples of ear images. The performance is
compared with four cutting-edge pre-trained CNN models. In terms of trainable
parameters, the proposed technique has significantly lower computational
complexity. The proposed model achieves 93% accuracy on the EarVN1.0 ear
dataset.
Comment: 10 pages, 4 figures, 2 tables
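Below is a minimal sketch of a compact CNN for binary gender classification from ear crops, reflecting the abstract's emphasis on a small number of trainable parameters. The layer sizes and input resolution are assumptions and do not reproduce the paper's architecture or EarVN1.0 preprocessing.

```python
# A minimal sketch, assuming 96x96 RGB ear crops; not the paper's model.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 2),                    # two classes: male / female
)
print(sum(p.numel() for p in model.parameters()))  # small parameter count
logits = model(torch.randn(1, 3, 96, 96))          # one fake ear crop
```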
Two-stage human verification using HandCAPTCHA and anti-spoofed finger biometrics with feature selection
This paper presents a human verification scheme in two independent stages to overcome vulnerabilities to attacks and to enhance security. In the first stage, a hand image-based CAPTCHA (HandCAPTCHA) is tested to avert automated bot attacks on the subsequent biometric stage. In the next stage, finger biometric verification of a legitimate user is performed with presentation attack detection (PAD) using the real hand images of the person who has passed a random HandCAPTCHA challenge. The electronic screen-based PAD is tested using image quality metrics. After this spoofing detection, geometric features are extracted from the four fingers (excluding the thumb) of real users. A modified forward–backward (M-FoBa) algorithm is devised to select relevant features for biometric authentication. The experiments are performed on the Boğaziçi University (BU) and the IIT-Delhi (IITD) hand databases using the k-nearest neighbor and random forest classifiers. The average accuracy of the correct HandCAPTCHA solution is 98.5%, and the false accept rate of a bot is 1.23%. The PAD is tested on 255 subjects of BU, and the best average error is 0%. A finger biometric identification accuracy of 98% and an equal error rate (EER) of 6.5% are achieved for 500 subjects of BU. For 200 subjects of IITD, 99.5% identification accuracy and a 5.18% EER are obtained.
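Below is a minimal sketch of a generic forward-backward feature-selection wrapper with a k-NN scorer, to illustrate the family of algorithms that M-FoBa modifies; the paper's specific modification and stopping criteria are not reproduced, and the data here are synthetic.

```python
# A minimal sketch, assuming synthetic data and a plain greedy wrapper;
# not the paper's M-FoBa algorithm.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def forward_backward_select(X, y, max_feats=10):
    def score(idx):
        return cross_val_score(
            KNeighborsClassifier(n_neighbors=5), X[:, idx], y, cv=3).mean()
    selected = []
    for _ in range(max_feats):          # bounded passes avoid cycling
        # Forward step: add the single feature that helps accuracy most.
        rest = [f for f in range(X.shape[1]) if f not in selected]
        best = max(rest, key=lambda f: score(selected + [f]))
        selected.append(best)
        # Backward step: drop an earlier feature if removing it helps.
        for f in list(selected[:-1]):
            without = [g for g in selected if g != f]
            if score(without) >= score(selected):
                selected = without
    return selected

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 30))       # 30 fake geometric finger features
y = (X[:, 3] + X[:, 7] > 0).astype(int)  # labels depend on two features
print(forward_backward_select(X, y, max_feats=4))
```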
Isolation and Characterisation of the Lipopolysaccharide From Acidiphilium Strain GS18h/ATCC55963, a Soil Isolate of Indian Copper Mine
The lipopolysaccharide (LPS) of the Gram-negative Acidiphilium strain GS18h/ATCC55963, a new soil isolate, exhibited very low endotoxic activity as determined by Limulus gelation activity, lethal toxicity in galactosamine (GalN)-sensitised mice, and the level of tumor necrosis factor alpha (TNF-α) in the blood serum of BALB/c mice. Analysis of the LPS, especially of lipid A, which usually accounts for the toxicity, revealed the latter to contain glucosamine and phosphate besides fatty acids, of which 14:0(3-OH), 18:0(3-OH), 18:1 and 19:0(cyclo) are the major components, while 12:0, 16:0, 19:1, 20:0(3-OH) and 20:1(3-OH) are present in small amounts. The 14:0(3-OH) and 18:0(3-OH) fatty acids are amide-linked, whereas the rest are ester-bound. Glucose, galactose, mannose, rhamnose, heptose, galacturonic acid and 3-deoxy-D-manno-oct-2-ulosonic acid (Kdo) were present in the polysaccharide part of this LPS. Sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS–PAGE) of the LPS showed a macromolecular heterogeneity distinctly different from that of Escherichia coli or Salmonella. The extremely low toxicity of this LPS, attributed to the fatty acid composition of its lipid A, promises potential therapeutic applications.